Dataset statistics
| Number of variables | 12 |
|---|---|
| Number of observations | 4898 |
| Missing cells | 5376 |
| Missing cells (%) | 9.1% |
| Duplicate rows | 118 |
| Duplicate rows (%) | 2.4% |
| Total size in memory | 459.3 KiB |
| Average record size in memory | 96.0 B |
Variable types
| NUM | 11 |
|---|---|
| CAT | 1 |
Warnings
| Dataset has 118 (2.4%) duplicate rows | Duplicates |
|---|---|
| fixed acidity has 498 (10.2%) missing values | Missing |
| volatile acidity has 467 (9.5%) missing values | Missing |
| citric acid has 487 (9.9%) missing values | Missing |
| residual sugar has 464 (9.5%) missing values | Missing |
| chlorides has 511 (10.4%) missing values | Missing |
| free sulfur dioxide has 500 (10.2%) missing values | Missing |
| total sulfur dioxide has 469 (9.6%) missing values | Missing |
| density has 485 (9.9%) missing values | Missing |
| pH has 483 (9.9%) missing values | Missing |
| sulphates has 526 (10.7%) missing values | Missing |
| alcohol has 486 (9.9%) missing values | Missing |
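The overview figures can be cross-checked against the per-column missing counts. The sketch below uses plain Python and numbers copied from this report to recompute the missing-cell total, the percentages, and the average record size. The variable and function names are illustrative and are not part of pandas-profiling.

```python
# Counts copied from this report; all names here are illustrative only.
N_ROWS, N_COLS = 4898, 12

missing_per_column = {
    "fixed acidity": 498,
    "volatile acidity": 467,
    "citric acid": 487,
    "residual sugar": 464,
    "chlorides": 511,
    "free sulfur dioxide": 500,
    "total sulfur dioxide": 469,
    "density": 485,
    "pH": 483,
    "sulphates": 526,
    "alcohol": 486,
    "quality": 0,
}


def pct(part, whole):
    """Format a share as a percentage with one decimal place."""
    return f"{100 * part / whole:.1f}%"


# Missing cells are counted over every cell, i.e. rows times columns.
total_missing = sum(missing_per_column.values())
missing_cells_pct = pct(total_missing, N_ROWS * N_COLS)
duplicate_rows_pct = pct(118, N_ROWS)

# 459.3 KiB spread over 4898 rows, expressed in bytes per row.
avg_record_bytes = 459.3 * 1024 / N_ROWS

print(total_missing, missing_cells_pct, duplicate_rows_pct, round(avg_record_bytes, 1))
# prints: 5376 9.1% 2.4% 96.0
```

The per-column counts sum to the reported 5376 missing cells, and each column's percentage is its count divided by the 4898 observations.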
Reproduction
| Analysis started | 2020-10-14 11:42:20.481503 |
|---|---|
| Analysis finished | 2020-10-14 11:42:39.627485 |
| Duration | 19.15 seconds |
| Software version | pandas-profiling v2.9.0 |
| Download configuration | config.yaml |
fixed acidity
| Distinct | 67 |
|---|---|
| Distinct (%) | 1.5% |
| Missing | 498 |
| Missing (%) | 10.2% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 6.849079545 |
|---|---|
| Minimum | 3.8 |
| Maximum | 11.8 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 38.3 KiB |
Quantile statistics
| Minimum | 3.8 |
|---|---|
| 5-th percentile | 5.6 |
| Q1 | 6.3 |
| median | 6.8 |
| Q3 | 7.3 |
| 95-th percentile | 8.3 |
| Maximum | 11.8 |
| Range | 8 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 0.8358749443 |
|---|---|
| Coefficient of variation (CV) | 0.1220419384 |
| Kurtosis | 1.113152139 |
| Mean | 6.849079545 |
| Median Absolute Deviation (MAD) | 0.5 |
| Skewness | 0.5085564094 |
| Sum | 30135.95 |
| Variance | 0.6986869225 |
| Monotonicity | Not monotonic |
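The derived statistics in these tables follow from a handful of base quantities. The sketch below shows one way to compute them with Python's standard library. It assumes a sample standard deviation (n - 1) and linearly interpolated quartiles, which are the pandas defaults but are not confirmed by this report. The function name `describe` and the sample values are hypothetical.

```python
import statistics


def describe(values):
    """Summary statistics for a numeric column; None marks a missing value."""
    data = sorted(v for v in values if v is not None)
    # "inclusive" interpolates linearly between the sorted data points.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    mean = statistics.fmean(data)
    std = statistics.stdev(data)  # sample standard deviation (n - 1)
    return {
        "count": len(data),
        "missing": len(values) - len(data),
        "mean": mean,
        "std": std,
        "variance": statistics.variance(data),
        "cv": std / mean,  # coefficient of variation
        "range": data[-1] - data[0],
        "iqr": q3 - q1,
        # Median absolute deviation around the median.
        "mad": statistics.median(abs(v - median) for v in data),
    }


sample = [6.3, None, 6.8, 7.2, 7.0, 8.1]  # hypothetical values
stats = describe(sample)
print(stats["count"], stats["missing"])
```

The published fixed acidity figures satisfy the same identities: 0.8358749443 / 6.849079545 ≈ 0.12204 (CV), 11.8 - 3.8 = 8 (range), 7.3 - 6.3 = 1 (IQR), and 30135.95 / 6.849079545 ≈ 4400, the number of non-missing observations.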
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
| 6.8 | 276 | 5.6% | |
| 6.6 | 262 | 5.3% | |
| 6.4 | 252 | 5.1% | |
| 6.7 | 215 | 4.4% | |
| 6.9 | 212 | 4.3% | |
| 7 | 211 | 4.3% | |
| 6.5 | 210 | 4.3% | |
| 7.2 | 190 | 3.9% | |
| 7.4 | 177 | 3.6% | |
| 6.2 | 176 | 3.6% | |
| Other values (57) | 2219 | 45.3% | |
| (Missing) | 498 | 10.2% |
| Value | Count | Frequency (%) | |
| 3.8 | 1 | < 0.1% | |
| 3.9 | 1 | < 0.1% | |
| 4.2 | 2 | < 0.1% | |
| 4.4 | 3 | 0.1% | |
| 4.5 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 11.8 | 1 | < 0.1% | |
| 10.7 | 1 | < 0.1% | |
| 10.3 | 2 | < 0.1% | |
| 10.2 | 1 | < 0.1% | |
| 10 | 2 | < 0.1% |
volatile acidity
| Distinct | 119 |
|---|---|
| Distinct (%) | 2.7% |
| Missing | 467 |
| Missing (%) | 9.5% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.2778571429 |
|---|---|
| Minimum | 0.08 |
| Maximum | 1.1 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 38.3 KiB |
Quantile statistics
| Minimum | 0.08 |
|---|---|
| 5-th percentile | 0.15 |
| Q1 | 0.21 |
| median | 0.26 |
| Q3 | 0.32 |
| 95-th percentile | 0.46 |
| Maximum | 1.1 |
| Range | 1.02 |
| Interquartile range (IQR) | 0.11 |
Descriptive statistics
| Standard deviation | 0.1001514976 |
|---|---|
| Coefficient of variation (CV) | 0.3604424079 |
| Kurtosis | 5.177265027 |
| Mean | 0.2778571429 |
| Median Absolute Deviation (MAD) | 0.06 |
| Skewness | 1.568396912 |
| Sum | 1231.185 |
| Variance | 0.01003032248 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
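With fixed-size bins, the observed range is split into equal-width intervals. For this variable the range of 1.02 and 50 bins give a bin width of 0.0204. The sketch below is a minimal plain-Python version of such binning. It assumes the maximum value is placed in the last bin, a common convention that this report does not confirm, and the function name is hypothetical.

```python
def fixed_width_histogram(values, bins=50):
    """Count values in `bins` equal-width intervals spanning min..max."""
    data = [v for v in values if v is not None]  # skip missing values
    lo, hi = min(data), max(data)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in data:
        # The maximum sits on the right edge, so clamp it into the last bin.
        # If all values are equal, width is 0 and everything goes in bin 0.
        index = min(int((v - lo) / width), bins - 1) if width else 0
        counts[index] += 1
    return counts, width


counts, width = fixed_width_histogram([0, 1, 2, 3, 4, None], bins=2)
print(counts, width)
```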
| Value | Count | Frequency (%) | |
| 0.28 | 232 | 4.7% | |
| 0.24 | 229 | 4.7% | |
| 0.26 | 221 | 4.5% | |
| 0.22 | 213 | 4.3% | |
| 0.25 | 213 | 4.3% | |
| 0.27 | 197 | 4.0% | |
| 0.23 | 194 | 4.0% | |
| 0.2 | 194 | 4.0% | |
| 0.3 | 181 | 3.7% | |
| 0.21 | 169 | 3.5% | |
| Other values (109) | 2388 | 48.8% | |
| (Missing) | 467 | 9.5% |
| Value | Count | Frequency (%) | |
| 0.08 | 4 | 0.1% | |
| 0.085 | 1 | < 0.1% | |
| 0.1 | 5 | 0.1% | |
| 0.105 | 6 | 0.1% | |
| 0.11 | 12 | 0.2% |
| Value | Count | Frequency (%) | |
| 1.1 | 1 | < 0.1% | |
| 1.005 | 1 | < 0.1% | |
| 0.965 | 1 | < 0.1% | |
| 0.93 | 1 | < 0.1% | |
| 0.905 | 1 | < 0.1% |
citric acid
| Distinct | 87 |
|---|---|
| Distinct (%) | 2.0% |
| Missing | 487 |
| Missing (%) | 9.9% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.3346701428 |
|---|---|
| Minimum | 0 |
| Maximum | 1.66 |
| Zeros | 17 |
| Zeros (%) | 0.3% |
| Memory size | 38.3 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0.17 |
| Q1 | 0.27 |
| median | 0.32 |
| Q3 | 0.39 |
| 95-th percentile | 0.54 |
| Maximum | 1.66 |
| Range | 1.66 |
| Interquartile range (IQR) | 0.12 |
Descriptive statistics
| Standard deviation | 0.1222739898 |
|---|---|
| Coefficient of variation (CV) | 0.3653567324 |
| Kurtosis | 6.46138061 |
| Mean | 0.3346701428 |
| Median Absolute Deviation (MAD) | 0.06 |
| Skewness | 1.336434192 |
| Sum | 1476.23 |
| Variance | 0.01495092858 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
| 0.3 | 277 | 5.7% | |
| 0.28 | 250 | 5.1% | |
| 0.32 | 234 | 4.8% | |
| 0.34 | 202 | 4.1% | |
| 0.29 | 198 | 4.0% | |
| 0.26 | 196 | 4.0% | |
| 0.27 | 193 | 3.9% | |
| 0.49 | 189 | 3.9% | |
| 0.31 | 177 | 3.6% | |
| 0.24 | 165 | 3.4% | |
| Other values (77) | 2330 | 47.6% | |
| (Missing) | 487 | 9.9% |
| Value | Count | Frequency (%) | |
| 0 | 17 | 0.3% | |
| 0.01 | 7 | 0.1% | |
| 0.02 | 4 | 0.1% | |
| 0.03 | 2 | < 0.1% | |
| 0.04 | 12 | 0.2% |
| Value | Count | Frequency (%) | |
| 1.66 | 1 | < 0.1% | |
| 1.23 | 1 | < 0.1% | |
| 1 | 5 | 0.1% | |
| 0.99 | 1 | < 0.1% | |
| 0.91 | 2 | < 0.1% |
residual sugar
| Distinct | 304 |
|---|---|
| Distinct (%) | 6.9% |
| Missing | 464 |
| Missing (%) | 9.5% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 6.393222824 |
|---|---|
| Minimum | 0.6 |
| Maximum | 65.8 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 38.3 KiB |
Quantile statistics
| Minimum | 0.6 |
|---|---|
| 5-th percentile | 1.1 |
| Q1 | 1.7 |
| median | 5.2 |
| Q3 | 9.85 |
| 95-th percentile | 15.8 |
| Maximum | 65.8 |
| Range | 65.2 |
| Interquartile range (IQR) | 8.15 |
Descriptive statistics
| Standard deviation | 5.086484589 |
|---|---|
| Coefficient of variation (CV) | 0.7956057109 |
| Kurtosis | 3.852947643 |
| Mean | 6.393222824 |
| Median Absolute Deviation (MAD) | 3.6 |
| Skewness | 1.117488511 |
| Sum | 28347.55 |
| Variance | 25.87232548 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
| 1.2 | 172 | 3.5% | |
| 1.4 | 163 | 3.3% | |
| 1.6 | 146 | 3.0% | |
| 1.3 | 136 | 2.8% | |
| 1.1 | 135 | 2.8% | |
| 1.5 | 125 | 2.6% | |
| 1.8 | 93 | 1.9% | |
| 1.7 | 88 | 1.8% | |
| 1 | 80 | 1.6% | |
| 2 | 71 | 1.4% | |
| Other values (294) | 3225 | 65.8% | |
| (Missing) | 464 | 9.5% |
| Value | Count | Frequency (%) | |
| 0.6 | 2 | < 0.1% | |
| 0.7 | 5 | 0.1% | |
| 0.8 | 23 | 0.5% | |
| 0.9 | 36 | 0.7% | |
| 0.95 | 2 | < 0.1% |
| Value | Count | Frequency (%) | |
| 65.8 | 1 | < 0.1% | |
| 31.6 | 2 | < 0.1% | |
| 26.05 | 2 | < 0.1% | |
| 23.5 | 1 | < 0.1% | |
| 22.6 | 1 | < 0.1% |
chlorides
| Distinct | 157 |
|---|---|
| Distinct (%) | 3.6% |
| Missing | 511 |
| Missing (%) | 10.4% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.04601800775 |
|---|---|
| Minimum | 0.009 |
| Maximum | 0.346 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 38.3 KiB |
Quantile statistics
| Minimum | 0.009 |
|---|---|
| 5-th percentile | 0.027 |
| Q1 | 0.036 |
| median | 0.043 |
| Q3 | 0.05 |
| 95-th percentile | 0.0677 |
| Maximum | 0.346 |
| Range | 0.337 |
| Interquartile range (IQR) | 0.014 |
Descriptive statistics
| Standard deviation | 0.02239629583 |
|---|---|
| Coefficient of variation (CV) | 0.4866854722 |
| Kurtosis | 37.06569 |
| Mean | 0.04601800775 |
| Median Absolute Deviation (MAD) | 0.007 |
| Skewness | 5.034741483 |
| Sum | 201.881 |
| Variance | 0.0005015940669 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
| 0.044 | 176 | 3.6% | |
| 0.036 | 173 | 3.5% | |
| 0.046 | 162 | 3.3% | |
| 0.048 | 159 | 3.2% | |
| 0.04 | 159 | 3.2% | |
| 0.045 | 157 | 3.2% | |
| 0.042 | 155 | 3.2% | |
| 0.047 | 155 | 3.2% | |
| 0.034 | 152 | 3.1% | |
| 0.038 | 149 | 3.0% | |
| Other values (147) | 2790 | 57.0% | |
| (Missing) | 511 | 10.4% |
| Value | Count | Frequency (%) | |
| 0.009 | 1 | < 0.1% | |
| 0.012 | 1 | < 0.1% | |
| 0.014 | 3 | 0.1% | |
| 0.015 | 4 | 0.1% | |
| 0.016 | 5 | 0.1% |
| Value | Count | Frequency (%) | |
| 0.346 | 1 | < 0.1% | |
| 0.301 | 1 | < 0.1% | |
| 0.29 | 1 | < 0.1% | |
| 0.271 | 1 | < 0.1% | |
| 0.255 | 1 | < 0.1% |
free sulfur dioxide
| Distinct | 129 |
|---|---|
| Distinct (%) | 2.9% |
| Missing | 500 |
| Missing (%) | 10.2% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 35.39483856 |
|---|---|
| Minimum | 2 |
| Maximum | 289 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 38.3 KiB |
Quantile statistics
| Minimum | 2 |
|---|---|
| 5-th percentile | 11 |
| Q1 | 23 |
| median | 34 |
| Q3 | 46 |
| 95-th percentile | 63 |
| Maximum | 289 |
| Range | 287 |
| Interquartile range (IQR) | 23 |
Descriptive statistics
| Standard deviation | 17.09292196 |
|---|---|
| Coefficient of variation (CV) | 0.4829213143 |
| Kurtosis | 12.17592225 |
| Mean | 35.39483856 |
| Median Absolute Deviation (MAD) | 11 |
| Skewness | 1.448020763 |
| Sum | 155666.5 |
| Variance | 292.1679811 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
| 29 | 137 | 2.8% | |
| 26 | 117 | 2.4% | |
| 35 | 116 | 2.4% | |
| 34 | 115 | 2.3% | |
| 31 | 115 | 2.3% | |
| 36 | 108 | 2.2% | |
| 24 | 106 | 2.2% | |
| 33 | 101 | 2.1% | |
| 25 | 100 | 2.0% | |
| 37 | 100 | 2.0% | |
| Other values (119) | 3283 | 67.0% | |
| (Missing) | 500 | 10.2% |
| Value | Count | Frequency (%) | |
| 2 | 1 | < 0.1% | |
| 3 | 9 | 0.2% | |
| 4 | 9 | 0.2% | |
| 5 | 19 | 0.4% | |
| 6 | 26 | 0.5% |
| Value | Count | Frequency (%) | |
| 289 | 1 | < 0.1% | |
| 146.5 | 1 | < 0.1% | |
| 131 | 1 | < 0.1% | |
| 128 | 1 | < 0.1% | |
| 124 | 1 | < 0.1% |
total sulfur dioxide
| Distinct | 247 |
|---|---|
| Distinct (%) | 5.6% |
| Missing | 469 |
| Missing (%) | 9.6% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 137.7562655 |
|---|---|
| Minimum | 9 |
| Maximum | 440 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 38.3 KiB |
Quantile statistics
| Minimum | 9 |
|---|---|
| 5-th percentile | 75 |
| Q1 | 108 |
| median | 134 |
| Q3 | 166 |
| 95-th percentile | 210 |
| Maximum | 440 |
| Range | 431 |
| Interquartile range (IQR) | 58 |
Descriptive statistics
| Standard deviation | 42.07877001 |
|---|---|
| Coefficient of variation (CV) | 0.3054581209 |
| Kurtosis | 0.5829306493 |
| Mean | 137.7562655 |
| Median Absolute Deviation (MAD) | 29 |
| Skewness | 0.3837512055 |
| Sum | 610122.5 |
| Variance | 1770.622886 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
| 111 | 60 | 1.2% | |
| 117 | 53 | 1.1% | |
| 150 | 53 | 1.1% | |
| 122 | 51 | 1.0% | |
| 118 | 50 | 1.0% | |
| 140 | 50 | 1.0% | |
| 128 | 50 | 1.0% | |
| 113 | 49 | 1.0% | |
| 114 | 48 | 1.0% | |
| 156 | 47 | 1.0% | |
| Other values (237) | 3918 | 80.0% | |
| (Missing) | 469 | 9.6% |
| Value | Count | Frequency (%) | |
| 9 | 1 | < 0.1% | |
| 18 | 2 | < 0.1% | |
| 19 | 1 | < 0.1% | |
| 21 | 1 | < 0.1% | |
| 24 | 2 | < 0.1% |
| Value | Count | Frequency (%) | |
| 440 | 1 | < 0.1% | |
| 366.5 | 1 | < 0.1% | |
| 307.5 | 1 | < 0.1% | |
| 303 | 1 | < 0.1% | |
| 294 | 1 | < 0.1% |
density
| Distinct | 869 |
|---|---|
| Distinct (%) | 19.7% |
| Missing | 485 |
| Missing (%) | 9.9% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.9940384194 |
|---|---|
| Minimum | 0.98711 |
| Maximum | 1.03898 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 38.3 KiB |
Quantile statistics
| Minimum | 0.98711 |
|---|---|
| 5-th percentile | 0.98963 |
| Q1 | 0.99174 |
| median | 0.9938 |
| Q3 | 0.99612 |
| 95-th percentile | 0.999008 |
| Maximum | 1.03898 |
| Range | 0.05187 |
| Interquartile range (IQR) | 0.00438 |
Descriptive statistics
| Standard deviation | 0.003013857396 |
|---|---|
| Coefficient of variation (CV) | 0.003031932505 |
| Kurtosis | 10.59304292 |
| Mean | 0.9940384194 |
| Median Absolute Deviation (MAD) | 0.00215 |
| Skewness | 1.040244148 |
| Sum | 4386.691545 |
| Variance | 9.083336401e-06 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
| 0.992 | 57 | 1.2% | |
| 0.9928 | 55 | 1.1% | |
| 0.9932 | 50 | 1.0% | |
| 0.993 | 47 | 1.0% | |
| 0.9934 | 47 | 1.0% | |
| 0.9938 | 45 | 0.9% | |
| 0.9944 | 42 | 0.9% | |
| 0.9927 | 41 | 0.8% | |
| 0.9948 | 40 | 0.8% | |
| 0.9924 | 40 | 0.8% | |
| Other values (859) | 3949 | 80.6% | |
| (Missing) | 485 | 9.9% |
| Value | Count | Frequency (%) | |
| 0.98711 | 1 | < 0.1% | |
| 0.98713 | 1 | < 0.1% | |
| 0.98722 | 1 | < 0.1% | |
| 0.9874 | 1 | < 0.1% | |
| 0.98742 | 2 | < 0.1% |
| Value | Count | Frequency (%) | |
| 1.03898 | 1 | < 0.1% | |
| 1.0103 | 2 | < 0.1% | |
| 1.00295 | 2 | < 0.1% | |
| 1.00241 | 1 | < 0.1% | |
| 1.0024 | 1 | < 0.1% |
pH
| Distinct | 101 |
|---|---|
| Distinct (%) | 2.3% |
| Missing | 483 |
| Missing (%) | 9.9% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.188061155 |
|---|---|
| Minimum | 2.74 |
| Maximum | 3.82 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 38.3 KiB |
Quantile statistics
| Minimum | 2.74 |
|---|---|
| 5-th percentile | 2.96 |
| Q1 | 3.08 |
| median | 3.18 |
| Q3 | 3.28 |
| 95-th percentile | 3.46 |
| Maximum | 3.82 |
| Range | 1.08 |
| Interquartile range (IQR) | 0.2 |
Descriptive statistics
| Standard deviation | 0.1516803655 |
|---|---|
| Coefficient of variation (CV) | 0.04757762104 |
| Kurtosis | 0.5643898337 |
| Mean | 3.188061155 |
| Median Absolute Deviation (MAD) | 0.1 |
| Skewness | 0.483780226 |
| Sum | 14075.29 |
| Variance | 0.02300693328 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
| 3.14 | 156 | 3.2% | |
| 3.16 | 151 | 3.1% | |
| 3.22 | 135 | 2.8% | |
| 3.19 | 132 | 2.7% | |
| 3.24 | 124 | 2.5% | |
| 3.08 | 124 | 2.5% | |
| 3.15 | 122 | 2.5% | |
| 3.18 | 121 | 2.5% | |
| 3.12 | 120 | 2.4% | |
| 3.2 | 120 | 2.4% | |
| Other values (91) | 3110 | 63.5% | |
| (Missing) | 483 | 9.9% |
| Value | Count | Frequency (%) | |
| 2.74 | 1 | < 0.1% | |
| 2.77 | 1 | < 0.1% | |
| 2.79 | 2 | < 0.1% | |
| 2.8 | 3 | 0.1% | |
| 2.83 | 4 | 0.1% |
| Value | Count | Frequency (%) | |
| 3.82 | 1 | < 0.1% | |
| 3.81 | 1 | < 0.1% | |
| 3.8 | 2 | < 0.1% | |
| 3.79 | 1 | < 0.1% | |
| 3.77 | 2 | < 0.1% |
sulphates
| Distinct | 78 |
|---|---|
| Distinct (%) | 1.8% |
| Missing | 526 |
| Missing (%) | 10.7% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.489878774 |
|---|---|
| Minimum | 0.22 |
| Maximum | 1.08 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 38.3 KiB |
Quantile statistics
| Minimum | 0.22 |
|---|---|
| 5-th percentile | 0.34 |
| Q1 | 0.41 |
| median | 0.47 |
| Q3 | 0.55 |
| 95-th percentile | 0.7045 |
| Maximum | 1.08 |
| Range | 0.86 |
| Interquartile range (IQR) | 0.14 |
Descriptive statistics
| Standard deviation | 0.1143431759 |
|---|---|
| Coefficient of variation (CV) | 0.2334111661 |
| Kurtosis | 1.642538891 |
| Mean | 0.489878774 |
| Median Absolute Deviation (MAD) | 0.07 |
| Skewness | 0.9843199583 |
| Sum | 2141.75 |
| Variance | 0.01307436187 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
| 0.5 | 228 | 4.7% | |
| 0.46 | 205 | 4.2% | |
| 0.44 | 195 | 4.0% | |
| 0.38 | 193 | 3.9% | |
| 0.42 | 162 | 3.3% | |
| 0.47 | 156 | 3.2% | |
| 0.49 | 153 | 3.1% | |
| 0.54 | 153 | 3.1% | |
| 0.4 | 152 | 3.1% | |
| 0.48 | 151 | 3.1% | |
| Other values (68) | 2624 | 53.6% | |
| (Missing) | 526 | 10.7% |
| Value | Count | Frequency (%) | |
| 0.22 | 1 | < 0.1% | |
| 0.23 | 1 | < 0.1% | |
| 0.25 | 4 | 0.1% | |
| 0.26 | 4 | 0.1% | |
| 0.27 | 11 | 0.2% |
| Value | Count | Frequency (%) | |
| 1.08 | 1 | < 0.1% | |
| 1.06 | 1 | < 0.1% | |
| 1.01 | 1 | < 0.1% | |
| 1 | 1 | < 0.1% | |
| 0.99 | 1 | < 0.1% |
alcohol
| Distinct | 102 |
|---|---|
| Distinct (%) | 2.3% |
| Missing | 486 |
| Missing (%) | 9.9% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 10.51835071 |
|---|---|
| Minimum | 8 |
| Maximum | 14.2 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 38.3 KiB |
Quantile statistics
| Minimum | 8 |
|---|---|
| 5-th percentile | 8.9 |
| Q1 | 9.5 |
| median | 10.4 |
| Q3 | 11.4 |
| 95-th percentile | 12.7 |
| Maximum | 14.2 |
| Range | 6.2 |
| Interquartile range (IQR) | 1.9 |
Descriptive statistics
| Standard deviation | 1.234729584 |
|---|---|
| Coefficient of variation (CV) | 0.117388136 |
| Kurtosis | -0.7095914483 |
| Mean | 10.51835071 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 0.4852554398 |
| Sum | 46406.96333 |
| Variance | 1.524557145 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
| 9.4 | 202 | 4.1% | |
| 9.5 | 201 | 4.1% | |
| 9.2 | 189 | 3.9% | |
| 9 | 169 | 3.5% | |
| 11 | 146 | 3.0% | |
| 10.5 | 142 | 2.9% | |
| 10 | 142 | 2.9% | |
| 10.4 | 140 | 2.9% | |
| 9.1 | 133 | 2.7% | |
| 9.8 | 123 | 2.5% | |
| Other values (92) | 2825 | 57.7% | |
| (Missing) | 486 | 9.9% |
| Value | Count | Frequency (%) | |
| 8 | 2 | < 0.1% | |
| 8.4 | 3 | 0.1% | |
| 8.5 | 3 | 0.1% | |
| 8.6 | 20 | 0.4% | |
| 8.7 | 70 | 1.4% |
| Value | Count | Frequency (%) | |
| 14.2 | 1 | < 0.1% | |
| 14.05 | 1 | < 0.1% | |
| 14 | 5 | 0.1% | |
| 13.9 | 2 | < 0.1% | |
| 13.8 | 1 | < 0.1% |
quality
Categorical
| Distinct | 7 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 38.3 KiB |
| Value | Count | Frequency (%) | |
| 6.0 | 2198 | 44.9% | |
| 5.0 | 1457 | 29.7% | |
| 7.0 | 880 | 18.0% | |
| 8.0 | 175 | 3.6% | |
| 4.0 | 163 | 3.3% | |
| 3.0 | 20 | 0.4% | |
| 9.0 | 5 | 0.1% |
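The frequency column can be reproduced from the counts alone. The sketch below copies the counts from the table above and recomputes each percentage. The "< 0.1%" rule for nonzero shares that round to 0.0 is inferred from other rows in this report, not taken from the pandas-profiling source, and the names are illustrative.

```python
# Counts copied from the quality table above.
quality_counts = {6.0: 2198, 5.0: 1457, 7.0: 880, 8.0: 175, 4.0: 163, 3.0: 20, 9.0: 5}
n_rows = sum(quality_counts.values())


def frequency_pct(count, total):
    """Percentage with one decimal; nonzero shares that round to 0.0 show as '< 0.1%'."""
    text = f"{100 * count / total:.1f}"
    if count > 0 and text == "0.0":
        return "< 0.1%"
    return f"{text}%"


frequencies = {value: frequency_pct(count, n_rows) for value, count in quality_counts.items()}
distinct_pct = frequency_pct(len(quality_counts), n_rows)

for value, share in frequencies.items():
    print(value, quality_counts[value], share)
```

The counts sum to 4898, which matches the number of observations and the reported zero missing values for quality.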
Frequencies of value counts
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Histogram of lengths of the category
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
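The calculation described above can be sketched in a few lines of plain Python. The function name `pearson_r` and the sample data are illustrative. The example also checks invariance under a change of location and (positive) scale.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    # The 1/(n - 1) factors in the covariance and both variances cancel.
    return cov / math.sqrt(var_x * var_y)


x = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical data
y = [2.0, 4.0, 5.0, 4.0, 5.0]

r = pearson_r(x, y)
# Shifting and rescaling each variable separately leaves r unchanged.
r_rescaled = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])
print(round(r, 4), round(r_rescaled, 4))
```

Note that a negative scale factor on one variable flips the sign of r, so the invariance holds for positive rescaling.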
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
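As a minimal sketch of this definition, the code below ranks both variables (giving tied values their average rank, a common convention assumed here) and then applies the Pearson formula to the ranks. All function names and data are illustrative.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Covariance divided by the product of the standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Pearson's r computed on the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))


x = [1, 2, 3, 4, 5]  # hypothetical data
y = [v ** 3 for v in x]  # monotonic but not linear
print(spearman_rho(x, y), round(pearson_r(x, y), 3))
```

For a monotonic but nonlinear relationship such as the cubic above, ρ reaches 1 while Pearson's r stays below 1.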
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
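The pair-counting definition above corresponds to the variant usually called tau-a. The sketch below implements it directly in plain Python. The function name and data are illustrative, and tied pairs are counted as neither concordant nor discordant. Libraries often default to a tie-corrected variant (for example, SciPy's `kendalltau` computes tau-b by default), so results can differ when ties are present.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1  # both variables order the pair the same way
        elif product < 0:
            discordant += 1  # the variables order the pair oppositely
        # product == 0 means a tie in x or y; it counts toward neither.
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


x = [1, 2, 3, 4]  # hypothetical data
y = [1, 3, 2, 4]  # one swapped pair out of six
tau = kendall_tau_a(x, y)
print(round(tau, 4))
```

With four observations there are six pairs. Five are concordant and one is discordant, giving τ = (5 - 1) / 6.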
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.
First rows
| | fixed acidity | volatile acidity | citric acid | residual sugar | chlorides | free sulfur dioxide | total sulfur dioxide | density | pH | sulphates | alcohol | quality |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | NaN | 0.27 | 0.36 | 20.7 | 0.045 | 45.0 | 170.0 | 1.0010 | 3.00 | NaN | 8.8 | 6.0 |
| 1 | 6.3 | 0.30 | 0.34 | 1.6 | 0.049 | 14.0 | 132.0 | 0.9940 | 3.30 | 0.49 | 9.5 | 6.0 |
| 2 | 8.1 | 0.28 | 0.40 | 6.9 | 0.050 | 30.0 | 97.0 | 0.9951 | 3.26 | 0.44 | NaN | 6.0 |
| 3 | 7.2 | 0.23 | 0.32 | 8.5 | 0.058 | 47.0 | 186.0 | 0.9956 | 3.19 | NaN | 9.9 | 6.0 |
| 4 | 7.2 | NaN | 0.32 | 8.5 | 0.058 | 47.0 | 186.0 | 0.9956 | 3.19 | 0.40 | NaN | 6.0 |
| 5 | 8.1 | 0.28 | 0.40 | NaN | 0.050 | 30.0 | 97.0 | 0.9951 | 3.26 | 0.44 | 10.1 | 6.0 |
| 6 | 6.2 | 0.32 | 0.16 | 7.0 | 0.045 | 30.0 | 136.0 | 0.9949 | 3.18 | 0.47 | 9.6 | 6.0 |
| 7 | 7.0 | 0.27 | 0.36 | 20.7 | 0.045 | 45.0 | 170.0 | 1.0010 | 3.00 | 0.45 | 8.8 | 6.0 |
| 8 | 6.3 | 0.30 | 0.34 | 1.6 | 0.049 | NaN | 132.0 | 0.9940 | 3.30 | 0.49 | 9.5 | 6.0 |
| 9 | 8.1 | 0.22 | 0.43 | 1.5 | 0.044 | 28.0 | 129.0 | 0.9938 | 3.22 | 0.45 | 11.0 | 6.0 |
Last rows
| | fixed acidity | volatile acidity | citric acid | residual sugar | chlorides | free sulfur dioxide | total sulfur dioxide | density | pH | sulphates | alcohol | quality |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 4888 | 6.8 | 0.220 | 0.36 | 1.20 | 0.052 | NaN | 127.0 | 0.99330 | 3.04 | 0.54 | 9.2 | 5.0 |
| 4889 | 4.9 | 0.235 | 0.27 | 11.75 | 0.030 | 34.0 | 118.0 | 0.99540 | 3.07 | 0.50 | 9.4 | 6.0 |
| 4890 | 6.1 | 0.340 | 0.29 | 2.20 | 0.036 | 25.0 | 100.0 | 0.98938 | 3.06 | 0.44 | NaN | 6.0 |
| 4891 | 5.7 | 0.210 | 0.32 | 0.90 | 0.038 | 38.0 | 121.0 | 0.99074 | NaN | NaN | 10.6 | 6.0 |
| 4892 | 6.5 | 0.230 | 0.38 | NaN | 0.032 | 29.0 | 112.0 | 0.99298 | 3.29 | 0.54 | 9.7 | 5.0 |
| 4893 | 6.2 | 0.210 | 0.29 | 1.60 | 0.039 | 24.0 | 92.0 | 0.99114 | 3.27 | 0.50 | NaN | 6.0 |
| 4894 | 6.6 | 0.320 | 0.36 | 8.00 | NaN | 57.0 | 168.0 | 0.99490 | 3.15 | 0.46 | 9.6 | 5.0 |
| 4895 | 6.5 | NaN | 0.19 | 1.20 | 0.041 | 30.0 | 111.0 | 0.99254 | 2.99 | 0.46 | 9.4 | 6.0 |
| 4896 | 5.5 | 0.290 | 0.30 | 1.10 | 0.022 | 20.0 | 110.0 | 0.98869 | 3.34 | 0.38 | 12.8 | 7.0 |
| 4897 | 6.0 | 0.210 | 0.38 | 0.80 | 0.020 | 22.0 | 98.0 | 0.98941 | 3.26 | 0.32 | 11.8 | 6.0 |
Most frequent
| | fixed acidity | volatile acidity | citric acid | residual sugar | chlorides | free sulfur dioxide | total sulfur dioxide | density | pH | sulphates | alcohol | quality | count |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 66 | 7.3 | 0.19 | 0.27 | 13.9 | 0.057 | 45.0 | 155.0 | 0.99807 | 2.94 | 0.41 | 8.8 | 8.0 | 5 |
| 71 | 7.4 | 0.16 | 0.30 | 13.7 | 0.056 | 33.0 | 168.0 | 0.99825 | 2.90 | 0.44 | 8.7 | 7.0 | 5 |
| 63 | 7.2 | 0.25 | 0.28 | 14.4 | 0.055 | 55.0 | 205.0 | 0.99860 | 3.12 | 0.38 | 9.0 | 7.0 | 4 |
| 77 | 7.6 | 0.20 | 0.30 | 14.2 | 0.056 | 53.0 | 212.5 | 0.99900 | 3.14 | 0.46 | 8.9 | 8.0 | 4 |
| 9 | 6.2 | 0.22 | 0.28 | 2.2 | 0.040 | 24.0 | 125.0 | 0.99170 | 3.19 | 0.48 | 10.5 | 6.0 | 3 |
| 14 | 6.3 | 0.13 | 0.42 | 1.1 | 0.043 | 63.0 | 146.0 | 0.99066 | 3.13 | 0.72 | 11.2 | 7.0 | 3 |
| 16 | 6.4 | 0.24 | 0.26 | 8.2 | 0.054 | 47.0 | 182.0 | 0.99538 | 3.12 | 0.50 | 9.5 | 5.0 | 3 |
| 20 | 6.5 | 0.18 | 0.41 | 14.2 | 0.039 | 47.0 | 129.0 | 0.99678 | 3.28 | 0.72 | 10.3 | 7.0 | 3 |
| 48 | 7.0 | 0.15 | 0.28 | 14.7 | 0.051 | 29.0 | 149.0 | 0.99792 | 2.96 | 0.39 | 9.0 | 7.0 | 3 |
| 82 | 7.7 | 0.30 | 0.42 | 14.3 | 0.045 | 45.0 | 213.0 | 0.99910 | 3.18 | 0.63 | 9.2 | 5.0 | 3 |